Abstract

Noise shaping refers to an analog-to-digital conversion methodology in which quantization error is arranged to lie 
mostly outside the signal spectrum by means of oversampling and feedback. Recently it has been successfully applied 
to more general redundant linear sampling and reconstruction systems associated with frames as well as non-linear 
systems associated with compressive sampling. This chapter reviews some of the recent progress in this subject. 


1 Introduction 

Source coding via quantized linear representations, also known as transform coding, is a classical and well-studied 
subject. Yet it is poorly understood outside the simple setting of orthogonal transforms, namely, for frame-based 
representations. The same can also be said for partially nonlinear representations such as those based on compressive 
sampling. The basic reason for the difficulty in solving the quantization problem for these more general sampling 
and reconstruction systems is the lack of an analog of Parseval’s identity which, more or less, dictates the best
quantization strategy for orthogonal systems. While some kind of basic reconstruction stability can be ensured relatively
easily, these results do not offer correct rate-distortion trade-offs because of their inefficiency in utilizing redundancy, 
especially under constraints that do not allow for high-resolution quantization. 

Redundancy is a key concept of frame-based as well as compressive sampling systems. It can be understood 
in terms of the sampling process (e.g., what part of the coefficient space is taken up with the actual measurements) 
or in terms of the reconstruction process (e.g., which perturbations of the measurements have the smallest effect on 
the reconstruction). Efficient encoding via the first approach is generally not practical because codewords cannot 
be easily placed arbitrarily in the coefficient space. Indeed, quantized measurements are typically required to lie 
on a finite rectangular grid. An alternative approach is then to seek ways of arranging the quantization error in 
the coefficient space to lie in directions that are away from the actual measurements, typically by means of some 
feedback process. Noise shaping is the generic name of this quantization methodology. It has its roots in sigma-delta 
modulation, which is used for oversampled analog-to-digital (A/D) conversion [25, 34, 9, 40].

Let us explain the philosophy of noise shaping in more concrete terms. In both frame-based and compressive
sampling systems, we have a linear sampling operator $\Phi$ that can be inverted on a given space $X$ of signals using
some (possibly nonlinear) reconstruction operator $\Psi$. Given a signal $x \in X$ and its sampled version $y = \Phi x$, ordinarily
we recover $x$ exactly (or approximately, as in compressive sampling) as $\Psi(y)$. In the context of this paper, quantization
of $y$ will mean replacing it with a vector $q$ which is of the same dimensionality as $y$ and whose entries are chosen from
some given alphabet $\mathcal{A}$. The goal is to choose $q$ so that the approximate reconstruction $x^\# := \Psi(q)$ is as close to $x$ as
possible as $x$ varies over $X$.

In the context of finite frames, $\Phi$ is a full-rank $m \times k$ matrix where $m > k$, and $\Psi$ is any left inverse of $\Phi$. The
rows of $\Phi$ form the analysis frame and the columns of $\Psi$ form a synthesis frame dual to this frame. With $y = \Phi x$ and
$x^\# = \Psi q$ as above, when $y$ is replaced by a quantized vector $q$, the reconstruction error $e := x - x^\#$ is equal to $\Psi(y - q)$.
Therefore the correct strategy to reduce the size of $e$ is not to minimize the Euclidean norm $\|y - q\|$ as memoryless
scalar quantization (MSQ) does, but to minimize the semi-norm $|y - q|_\Psi := \|\Psi(y - q)\|$. In other words, we seek
$q \in \mathcal{A}^m$ so that the quantization “noise” $y - q$ is close to $\ker(\Psi)$ in the above sense. This is the basic principle of
noise shaping. How this goal can be achieved (approximately), i.e., the actual process of noise shaping, as well as
what noise shaping can offer for source coding are nontrivial questions that will be addressed in this article.

While the basic principle of noise shaping is formulated above for linear sampling and reconstruction systems, 
its philosophy extends to compressive sampling systems where the reconstruction operator is generally nonlinear. 
The simplest connection is made by considering strictly sparse signals. Let $\Sigma_k^N$ denote the nonlinear space of
$N$-dimensional vectors which have no more than $k$ nonzero entries. In the context of compressive sampling, $\Phi$ is an
$m \times N$ matrix where $m \ll N$, which means that the sampling process is lossy for the whole of $\mathbb{R}^N$. However, note
that $\Sigma_k^N$ is the union of (a large number of) $k$-dimensional linear subspaces on each of which $\Phi$ acts like a frame once
$m > k$. This observation opens up the possibility of noise shaping. Indeed, fixing any one of these subspaces $V$, we
can envision a noise shaping process associated with any of the linear inverses (duals) of $\Phi$ on $V$. However, it is not
clear how one might organize all of these individual noise shaping processes, especially given that these subspaces 
are not directly available to the quantizer. What comes to the rescue is the notion of an alternative dual. While 
we formulated noise shaping above as matching the quantization operator to a given dual frame, it is also possible 
to consider matching the dual frame to a given quantization operator. This results in the possibility of “universal” 
quantization processes (i.e., independent of the signal subspace) which become noise-shaping processes for suitable 
alternative duals. Even though finding these suitable alternative duals may require extracting information about the 
signal subspace, this duty purely belongs to the decoder and not the quantizer. 

This article is organized as follows. In Section 2 we review the basics of classical noise shaping in the setting of
sigma-delta ($\Sigma\Delta$) modulation. In Section 3 we extend the formulation of noise shaping and introduce various notions
of alternative duals for noise shaping in the setting of frames, followed by their performance analysis for random
frames in Section 4. We then turn to noise-shaping quantization methods for compressive sampling.


2 Classical noise shaping: Sigma-Delta Modulation 


The Shannon-Nyquist sampling theorem for bandlimited functions provides the natural framework of conventional
A/D conversion systems. With the Fourier transform normalized according to the “ordinary-frequency” convention

$$\hat{f}(\xi) := \int_{-\infty}^{\infty} f(t)\, e^{-2\pi i \xi t}\, dt,$$

let us define the space $\mathcal{B}_\Omega$ of bandlimited functions to be all $x$ in $L^2(\mathbb{R})$ such that $\hat{x}$ is supported in $[-\Omega, \Omega]$. The
classical sampling theorem says that any $x \in \mathcal{B}_\Omega$ can be reconstructed perfectly from its time samples $(x(n\tau))_{n \in \mathbb{Z}}$
according to the formula

$$x(t) = \tau \sum_{n \in \mathbb{Z}} x(n\tau)\, \psi(t - n\tau), \tag{1}$$

where $\tau \le \tau_{\mathrm{crit}} := \frac{1}{2\Omega}$, and $\psi$ is any function in $L^2(\mathbb{R})$ such that

$$\hat{\psi}(\xi) = \begin{cases} 1, & |\xi| \le \Omega, \\ 0, & |\xi| \ge \frac{1}{2\tau}. \end{cases} \tag{2}$$

Hence, if we define the sampling operator $(\Phi x)_n := x(n\tau)$ and the reconstruction operator $\Psi(u) := \tau \sum_{n \in \mathbb{Z}} u_n \psi(\cdot - n\tau)$
(on any space it makes sense), then $\Psi$ is a left inverse of $\Phi$ on $\mathcal{B}_\Omega$ when $\tau$ and $\psi$ satisfy the conditions stated above.

The value $\rho := 1/\tau$ is called the sampling rate, and $\rho_{\mathrm{crit}} := 1/\tau_{\mathrm{crit}} = 2\Omega$ is called the critical (or Nyquist) sampling
rate. Their ratio given by

$$\lambda := \frac{\rho}{\rho_{\mathrm{crit}}} = \frac{\tau_{\mathrm{crit}}}{\tau} \tag{3}$$

is called the oversampling ratio. According to the value of $\lambda$, A/D converters are broadly classified as Nyquist-rate
converters ($\lambda \approx 1$) or oversampling converters ($\lambda \gg 1$).

Nyquist-rate converters set their sampling rate $\rho$ slightly above the critical frequency $2\Omega$ so that $\psi$ may be chosen
to decay rapidly enough to ensure absolute summability of (1). Given any quantization alphabet $\mathcal{A}$, the (nearly)
optimal quantization strategy in this (nearly) orthogonal setting is memoryless scalar quantization (MSQ). This means
that each sample $y_n := x(n\tau)$ is rounded to the nearest quantization level $q_n \in \mathcal{A}$. This process is also referred to as
pulse-code modulation (PCM). If each sample is quantized with error no more than $\delta$, i.e., $\|y - q\|_\infty \le \delta$, then the
error signal

$$e(t) := x(t) - (\Psi q)(t) = \tau \sum_{n \in \mathbb{Z}} (y_n - q_n)\, \psi(t - n\tau) \tag{4}$$

obeys the bound $\|e\|_{L^\infty} \le C\delta$ where $C$ is independent of $\delta$. This is essentially the best error bound one can expect
for Nyquist-rate converters. Because setting $\delta$ very small is costly, Nyquist-rate converters are not very suitable for
signals that require high fidelity such as audio signals.

Figure 1: Illustration of classical noise shaping via $\Sigma\Delta$ modulation: The superimposed Fourier spectra of a bandlimited
signal (in black), and the quantization error signals using MSQ (in red), 1st order $\Sigma\Delta$ modulation (in magenta), and
2nd order $\Sigma\Delta$ modulation (in blue).

Oversampling converters are designed to take advantage of the redundancy in the representation (1) when $\tau < \tau_{\mathrm{crit}}$.
In this case, the interpolation operator $\Psi$ has a kernel which gets bigger as $\tau \to 0$. Indeed, let $\hat{\psi}(\xi) = 0$ for $|\xi| \ge \Omega_0$.
It is easily seen that $\Psi(u) = 0$ if

$$\sum_{n \in \mathbb{Z}} u_n e^{2\pi i n \xi} = 0 \quad \text{for } |\xi| < \tau\Omega_0. \tag{5}$$

This means that even though $y - q$ may be large everywhere, $e = \Psi(y - q)$ can be very small if $y - q$ can be arranged
to be spectrally disjoint from the (discretized) reconstruction kernel $\psi$. This is the concrete form of noise shaping
that we briefly discussed in the Introduction.

The main focus of an oversampling A/D converter is on its quantization algorithm, which has to be non-local to
be useful, but also causal so that it can be implemented in real time. The assignment of each $q_n$ will therefore depend
on $y_n$ as well as a set of values (the states) that can be kept in an analog circuit memory, while meeting the spectral
constraints on $y - q$ as described in the previous paragraph. $\Sigma\Delta$ modulators operate according to these principles.

As can be seen in (5), the kernel of $\Psi$ consists of high-pass sequences. Hence the primary objective of $\Sigma\Delta$
modulation is to arrange the quantization error $y - q$ to be an approximate high-pass sequence (see Fig. 1). This
objective can be realized by setting up a difference equation, the so-called canonical $\Sigma\Delta$ equation, of the form

$$y - q = \Delta^r u, \tag{6}$$

where $\Delta$ denotes the finite difference operator defined by

$$(\Delta w)_n := w_n - w_{n-1}, \tag{7}$$

$r$ denotes the “order” of the scheme, and $u$ is an appropriate auxiliary sequence called the state sequence. This
equation does not imply anything about $q$ without any constraint on $u$. The most useful constraint turns out to be
boundedness.

In practice, the boundedness of $u$ in (6) has to be attained through a recursive algorithm. This means that given
any input sequence $(y_n)$, the $q_n$ are found by a given “quantization rule” of the form

$$q_n = F(u_{n-1}, u_{n-2}, \dots, y_n, y_{n-1}, \dots), \tag{8}$$

and the $u_n$ are updated via

$$u_n = \sum_{k=1}^{r} (-1)^{k-1} \binom{r}{k} u_{n-k} + y_n - q_n, \tag{9}$$

which is a restatement of (6). In electrical engineering, such a recursive procedure for quantization is called “feedback
quantization” due to the role $q_n$ plays as a feedback control term in (9). The role of the quantization rule $F$ is to keep
the system stable, i.e., $u$ bounded.
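
The recursion can be made concrete in the simplest case. The following is a minimal sketch, assuming a first-order scheme ($r = 1$), the one-bit alphabet $\{-1, +1\}$, and the greedy rule "take the sign of $u_{n-1} + y_n$"; the function name and the sinusoidal test input are illustrative choices, not taken from the text. For inputs with $|y_n| \le 1$ this rule keeps $|u_n| \le 1$, which is the stability property discussed above.

```python
import math


def sigma_delta_first_order(y):
    """Greedy first-order one-bit Sigma-Delta quantizer.

    Returns (q, u) with y_n - q_n = u_n - u_{n-1} and u_0 = 0.
    """
    q, u = [], []
    u_prev = 0.0  # initial state u_0 = 0
    for y_n in y:
        w = u_prev + y_n                # value the quantizer looks at
        q_n = 1.0 if w >= 0 else -1.0   # quantization rule F: sign of w
        u_n = w - q_n                   # state update for r = 1
        q.append(q_n)
        u.append(u_n)
        u_prev = u_n
    return q, u


# Hypothetical test input: a slowly varying sequence with |y_n| <= 0.9.
y = [0.9 * math.sin(2 * math.pi * n / 64) for n in range(256)]
q, u = sigma_delta_first_order(y)
max_state = max(abs(v) for v in u)  # stays bounded by 1 for this rule
```

Higher-order schemes need a more careful rule $F$ to remain stable; the greedy choice shown here is only guaranteed for the first-order, one-bit case under the stated input bound.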

Stability is a crucial property. Indeed, it was shown in [?] that a stable $r$th order scheme results in the error bound

$$\|e\|_{L^\infty} \le \|u\|_{\ell^\infty} \|\psi^{(r)}\|_{L^1} \tau^r, \tag{10}$$

where $\psi^{(r)}$ denotes the $r$th order derivative of $\psi$. The implicit $\Omega$- and the explicit $\tau$-dependence of this estimate can
be replaced with a single $\lambda$-dependence by setting $\psi(t) := \Omega\psi_0(\Omega t)$ where the prototype $\hat{\psi}_0(\xi)$ equals 1 on $[-1, 1]$
and vanishes for $|\xi| > 1 + \epsilon_0$, with $\epsilon_0 > 0$ fixed. Let $C_0 := \|\psi_0\|_{L^1}$. Bernstein’s inequality applied to $\psi_0$ yields

$$\|e\|_{L^\infty} \le C_0 \|u\|_{\ell^\infty} \pi^r (1 + \epsilon_0)^r \lambda^{-r}, \quad \text{for all } \lambda > 1 + \epsilon_0. \tag{11}$$

With this error bound, there are two goals in progression. The first is to keep $u$ bounded and the second is to
keep the bound small. Ultimately, the best strategy is to have, for each $r$, a quantization rule yielding a stable $r$th
order scheme, and then for any given $\lambda$, to choose the best one (i.e., the one with the least error bound). This
task is significantly complicated by the fact that the bound on $u$ has a strong dependence on $r$, especially for small
quantization alphabets $\mathcal{A}$. In general it is not possible to expect this dependence to be less than $(cr)^r$ for some
constant $c$ that depends on the given amplitude range $\mu$ for $x$. This growth order is also what is needed to ensure
that the reconstruction error decays exponentially, i.e., as $2^{-p\lambda}$, as a function of $\lambda$, which is the best possible due to
Kolmogorov entropy estimates for bandlimited functions [22]. The rate $p$ of exponential decay that is achievable by
the resulting family of schemes is inversely proportional to $c$, and gets worse as $\mu$ is increased. The question of best
achievable accuracy for oversampling converters in this setting remains open. Currently, the best result in the one-bit
case with $\mathcal{A} = \{-1, 1\}$ yields $\|e\|_{L^\infty} = O(2^{-p\lambda})$ where $p = \pi/(6e^2 \log 2) \approx 0.1$, and $\mu \approx 0.06$. Higher values of $p$
can be achieved with more levels in $\mathcal{A}$. For example, if $\mathcal{A} = \{-1, 0, 1\}$, then $p$ rises to 0.15 and $\mu$ to 0.25 [15].
These are rigorously proven bounds and the actual behavior of the error based on numerical experiments appears to
be better. For the details of the quantization rules which result in these exponentially accurate $\Sigma\Delta$ modulators, see
[?]. It has also been shown that no matter how the bits are assigned the rate of the exponential decay cannot
match that of Nyquist-rate conversion [30].


3 Generalized Noise-shaping Operators and Alternative Duals of Frames for Noise Shaping

In this section, we will generalize the classical theory of $\Sigma\Delta$ modulation to more general noise-shaping quantizers as
well as sampling and reconstruction systems. For conceptual clarity, we will separate the process of noise shaping 
from the processes of sampling and reconstruction. While we will present these generalizations in a finite-dimensional 
setting, extensions to infinite-dimensional settings are usually possible. We will also discuss the notion of alternative 
duals of frames which are associated with noise-shaping quantizers. 


3.1 A general framework of noise shaping 

The canonical $\Sigma\Delta$ equation we saw in (6) is a special case of a more general framework of noise shaping. Let $\mathcal{A}$ be a
finite quantization alphabet and $J$ be a compact interval in $\mathbb{R}$. Let $h = (h_j)_{j \ge 0}$ be a given sequence, finite or infinite,
where $h_0 = 1$. By a noise-shaping quantizer with the transfer filter $h$, we mean any sequence $Q = (Q_m)_1^\infty$ of maps
$Q_m : J^m \to \mathcal{A}^m$, $m \in \mathbb{N}$, where for each $y \in J^m$, the output $q := Q_m(y)$ satisfies

$$y - q = h * u \tag{12}$$

where $u \in \mathbb{R}^m$ and $\|u\|_\infty \le C$ for some constant $C$ which is independent of $m$. Here $h * u$ refers to the (finite)
convolution of $h$ and $u$ defined by

$$(h * u)_n := \sum_{j \ge 0} h_j u_{n-j}, \qquad 1 \le n \le m,$$

where it is assumed that $u_n := 0$ for $n \le 0$. Without any reference to a sampling or a reconstruction operator, noise
shaping in this setting refers to the fact that the “quantization noise” $y - q$ is spectrally aligned with $h$. Note that the
operator $H : u \mapsto h * u$ is invertible on $\mathbb{R}^m$ for any $m$, and therefore given any $y$ and $q$, there exists $u \in \mathbb{R}^m$ which satisfies
(12); this is trivial. However, the requirement that $\|u\|_\infty$ must be controlled uniformly in $m$ imposes restrictions on
what $q$ can be for a given $y$; these solutions are certainly non-trivial to find and may not always exist.
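
To see why existence of $u$ is trivial while uniform boundedness is not, it may help to work out the first-order filter $h = (1, -1)$ explicitly. The computation below is only an illustration; the constant input $c$ and the constant output level $a$ are hypothetical choices introduced here.

```latex
% First-order filter h = (1,-1): H is the difference operator, H^{-1} is summation.
\[
(h * u)_n = u_n - u_{n-1}
\quad\Longrightarrow\quad
u_n = \sum_{j=1}^{n} (y_j - q_j), \qquad 1 \le n \le m .
\]
% A "quantizer" that ignores feedback: constant output q_j = a \in \mathcal{A}
% for a constant input y_j = c \neq a.
\[
y_j = c,\ q_j = a \ \text{for all } j
\quad\Longrightarrow\quad
u_n = n\,(c - a),
\qquad
\|u\|_\infty = m\,|c - a| .
\]
```

So a state sequence $u$ always exists, but for this choice of $q$ its norm grows linearly in $m$; a noise-shaping quantizer must choose $q$ adaptively so that the running sum of $y_j - q_j$ stays bounded.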
The operator $H$ above (defined as convolution by $h$) is a lower triangular Toeplitz matrix with unit diagonal. With
this view, let us relax the notion of a noise-shaping quantizer and assume that $H$ is any lower triangular $m \times m$ matrix
with unit diagonal. We will refer to $H$ as a noise-shaping transfer operator where the associated noise-shaping relation
is given by

$$y - q = Hu. \tag{13}$$

Suppose we are given a sequence $(H_m)_1^\infty$ of $m \times m$ noise-shaping transfer operators. In this general setting, we say
that an associated sequence $(Q_m)_1^\infty$ of quantizer maps (for which $q := Q_m(y)$ and $u$ is determined by (13)) achieves
noise shaping for $(H_m)$, $J$, and $\mathcal{A}$, if $\|u\|_\infty \le C$ for some constant $C$ independent of $m$. A slightly weaker assumption
is to only require that $\|u\|_\infty = o(\|H_m^{-1}\|_{\infty \to \infty})$, though we shall not need to work in this generality in this paper.

In many applications, one works with $(H_m)_1^\infty$ which are “progressive” (also called “nested”) in the sense that

$$P_m \circ H_{m+1} \circ P_{m+1} = H_m \circ P_m,$$

where $P_m$ is the restriction of a vector to its first $m$ coordinates. Convolution is a standard example. In this case, it
may be natural to require that the $Q_m$ are progressive as well. The classical $\Sigma\Delta$ modulation we saw in Section 2 is
of this type. However, our general formulation does not impose progressiveness.

As indicated earlier, noise-shaping quantizers provide non-trivial solutions to (13) and therefore do not exist
unconditionally, though under certain suitable assumptions on $H$, $J$, and $\mathcal{A}$, they exist and can be implemented via
recursive algorithms. The simplest is the (non-overloading) greedy quantizer whose general formulation is given
below:

Proposition 3.1. Let $\mathcal{A} := \mathcal{A}_{L,\delta}$ denote the arithmetic progression in $\mathbb{R}$ which is of length $L$, spacing $2\delta$, and
symmetric about 0. Assume that $H = I - \tilde{H}$, where $\tilde{H}$ is strictly lower triangular, and $\mu \ge 0$ such that
$\|\tilde{H}\|_{\infty \to \infty} + \mu/\delta \le L$. Suppose $\|y\|_\infty \le \mu$. For each $n \ge 1$, let

$$q_n := \mathrm{round}_{\mathcal{A}}\Big( y_n + \sum_{j=1}^{n-1} \tilde{H}_{n,n-j} u_{n-j} \Big)$$

and

$$u_n := y_n + \sum_{j=1}^{n-1} \tilde{H}_{n,n-j} u_{n-j} - q_n.$$

Then the resulting $q$ satisfies (13) with $\|u\|_\infty \le \delta$.

This quantizer is called greedy because for all $n$, the selection of $q_n$ over $\mathcal{A}$ is made so as to minimize $|u_n|$.
The proof of this basic result follows easily by induction once we note that for any $w \in [-L\delta, L\delta]$, we have
$|w - \mathrm{round}_{\mathcal{A}}(w)| \le \delta$, hence the scalar quantizer $\mathrm{round}_{\mathcal{A}}$ is not overloaded. For details, see [?]. Note that the greedy
quantizer is progressive if $(H_m)_1^\infty$ is a progressive sequence of noise-shaping transfer operators. In the special case
$Hu = h * u$ where $h_0 = 1$, we simply have $\|\tilde{H}\|_{\infty \to \infty} = \|h\|_1 - 1$. This special case is well-known and widely utilized
(e.g., [9], [?], [?], [?]).
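
The greedy quantizer of Proposition 3.1 translates directly into a short loop. The sketch below is an illustration under stated assumptions: the helper names `make_alphabet` and `greedy_quantize`, the second-order filter $h = (1, -2, 1)$, and the parameter values $\delta = 0.25$, $\mu = 1$ are all choices made here for the example, not prescribed by the text. The alphabet length $L$ is taken as the smallest integer allowed by the condition $\|\tilde{H}\|_{\infty \to \infty} + \mu/\delta \le L$.

```python
import numpy as np


def make_alphabet(L, delta):
    """Arithmetic progression of length L, spacing 2*delta, symmetric about 0."""
    return delta * (2 * np.arange(L) - (L - 1))


def greedy_quantize(y, H, alphabet):
    """Greedy noise-shaping quantizer; returns (q, u) with y - q = H @ u."""
    m = len(y)
    H_tilde = np.eye(m) - H  # strictly lower triangular, H = I - H_tilde
    q = np.zeros(m)
    u = np.zeros(m)
    for n in range(m):
        # w = y_n + sum_j H_tilde[n, n-j] * u_{n-j}, using only past states
        w = y[n] + H_tilde[n, :n] @ u[:n]
        q[n] = alphabet[np.argmin(np.abs(alphabet - w))]  # nearest level
        u[n] = w - q[n]
    return q, u


# Illustrative parameters (assumptions): second-order filter h = (1, -2, 1).
m, delta, mu = 200, 0.25, 1.0
D = np.eye(m) - np.eye(m, k=-1)          # first-order difference matrix
H = D @ D                                # H = D^2
H_tilde_norm = np.abs(np.eye(m) - H).sum(axis=1).max()  # ||h||_1 - 1 = 3
L = int(np.ceil(H_tilde_norm + mu / delta))             # here L = 7

rng = np.random.default_rng(0)
y = rng.uniform(-mu, mu, size=m)         # any input with ||y||_inf <= mu
q, u = greedy_quantize(y, H, make_alphabet(L, delta))
```

Under these assumptions the proposition predicts `np.max(np.abs(u)) <= delta`, independently of `m`, while a smaller alphabet could overload the rounding step.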


3.2 Canonical duals of frames for noise shaping 

The earliest works on noise-shaping quantization in the context of finite frames used $\Sigma\Delta$ quantization and focused on
canonical duals for reconstruction. Before we begin our discussion of these contributions we remind the reader of
our convention: we identify an analysis frame with (the rows of) its analysis operator and a synthesis frame with (the 
columns of) its synthesis operator. 

Let $\Phi$ be a finite frame and $y = \Phi x$ be the frame measurements of a given signal $x$. Assume that we quantize $y$
using a noise-shaping quantizer with transfer operator $H$. Any left-inverse (dual) $\Psi$ of $\Phi$ gives

$$x - \Psi q = \Psi(y - q) = \Psi H u. \tag{14}$$

Using this expression, and specializing to the case of first order $\Sigma\Delta$ quantization, i.e., $H = D$ where $D$ is the lower
bidiagonal matrix whose diagonal entries are 1 and subdiagonal entries are $-1$, [?] observed that the reconstruction
error can be bounded as

$$\|x - \Psi q\|_2 \le \|u\|_\infty \sum_{j=1}^{m} \|(\Psi D)_j\|_2 \tag{15}$$

where $(\Psi D)_j$ denotes the $j$th column of $\Psi D$. This led [?] to introduce the notion of frame variation

$$\mathrm{Var}(\Psi) := \sum_{j=1}^{m} \|\psi_j - \psi_{j+1}\|_2 \tag{16}$$

with $\psi_j$ denoting the $j$th column of $\Psi$ and $\psi_{m+1}$ defined to be zero. Using normalized tight-frames, i.e., frames $\Phi$
for which $\Phi^*\Phi = (m/k)I$, this resulted in the error bound

$$\|x - \Phi^\dagger q\|_2 \le \frac{k}{m} \|u\|_\infty \mathrm{Var}(\Phi^*), \tag{17}$$

where $\Phi^\dagger$ denotes the canonical dual of $\Phi$ defined (for an arbitrary frame $\Phi$) by

$$\Phi^\dagger := (\Phi^*\Phi)^{-1}\Phi^*. \tag{18}$$
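
The frame-variation bound can be checked numerically on a small example. The sketch below assumes the frame of $m$th roots of unity in $\mathbb{R}^2$ (which satisfies $\Phi^*\Phi = (m/k)I$), first-order one-bit greedy $\Sigma\Delta$ quantization, and a hypothetical test signal `x`; the function name `frame_variation` is introduced here for illustration.

```python
import numpy as np

m, k = 64, 2
angles = 2 * np.pi * np.arange(m) / m
# Rows of Phi are the m-th roots of unity viewed as vectors in R^2.
Phi = np.column_stack([np.cos(angles), np.sin(angles)])
# Canonical dual; for this tight frame it equals (k/m) * Phi^T.
Psi = np.linalg.pinv(Phi)


def frame_variation(S):
    """Var(S) = sum_j ||s_j - s_{j+1}||_2 over the columns s_j, with s_{m+1} := 0."""
    padded = np.column_stack([S, np.zeros(S.shape[0])])
    return np.linalg.norm(padded[:, :-1] - padded[:, 1:], axis=0).sum()


# First-order Sigma-Delta with the one-bit alphabet {-1, +1} (greedy rule).
x = np.array([0.3, -0.2])   # hypothetical signal with ||x||_2 < 1
y = Phi @ x                 # frame measurements, |y_n| <= ||x||_2
q = np.zeros(m)
u = np.zeros(m)
state = 0.0
for n in range(m):
    w = state + y[n]
    q[n] = 1.0 if w >= 0 else -1.0
    u[n] = w - q[n]
    state = u[n]

error = np.linalg.norm(x - Psi @ q)
bound = (k / m) * np.max(np.abs(u)) * frame_variation(Phi.T)
```

Here `error` is the left-hand side of the bound for normalized tight frames and `bound` is its right-hand side; since consecutive roots of unity are close, the variation stays bounded as `m` grows and the bound decays like $1/m$.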

Subsequently, similarly defined higher-order frame variations were used to study the behavior of higher-order $\Sigma\Delta$
schemes (e.g., in [2] and [6]) with corresponding generalizations of (17) and the conclusion that frames with lower
variations lead to better error bounds. This motivated considering frames obtained via uniform sampling of smooth
curves in $\mathbb{R}^k$ (called frame paths). As it turned out, however, this type of analysis based on frame-variation bounds
does not provide higher-order reconstruction accuracy unless the frame path terminates smoothly. Smooth termination
of the frame path is not available for most of the commonly encountered frames, and finding frames with this property
can be challenging. Indeed, designing such frames was a main contribution of [6] which showed a reconstruction error
bound decaying as $m^{-r}$ for $r$th order $\Sigma\Delta$ quantization of measurements using these frames.

In practice, however, one must often work with a given frame rather than design a frame of one's choosing. In
such cases there are frames, sampled from smooth curves, for which reconstructing with the canonical dual yields
reconstruction error that is lower bounded by a term behaving like $m^{-1}$, regardless of the $\Sigma\Delta$ scheme’s order $r \ge 3$
(see [?] for the details). Consequently, to achieve better error decay rates one must seek either different quantization
or different reconstruction schemes. We will consider both routes to improving the error bounds in what follows.


3.3 Alternative duals of frames for noise shaping 

The discussion in Section 3.2 was based on canonical duals and it involved a particular method to bound the 2-norm
of the reconstruction error $x - \Psi q$, assuming $u$ is bounded in the $\infty$-norm. It is possible to significantly improve the
reconstruction accuracy by allowing for more general duals, here called alternative duals. To explain this route, we
return to the general noise-shaping quantization relation (13). We assume again that $u$ is known to be bounded in the
$\infty$-norm, which is essentially the only type of bound available. Hence, the most natural reconstruction error bound is
given by

$$\|x - \Psi q\|_2 \le \|\Psi H\|_{\infty \to 2} \|u\|_\infty. \tag{19}$$

With this bound, the natural objective would be to employ an alternative dual $\Psi$ of $\Phi$ which minimizes $\|\Psi H\|_{\infty \to 2}$.
An explicit solution for this problem is not readily available mainly because there is no easily computable expression
for $\|A\|_{\infty \to 2}$ for a general $k \times m$ matrix $A$, so we replace it by a simpler upper bound. In fact, this was already done in
(15) because we have

$$\|A\|_{\infty \to 2} \le \sum_{j=1}^{m} \|A_j\|_2 \tag{20}$$

where again $A_j$ denotes the $j$th column of $A$. (This upper bound is also known as the $L_{2,1}$-norm of $A$.) Another
such bound which is often (but not always) better is given by

$$\|A\|_{\infty \to 2} \le \sqrt{m}\, \|A\|_{2 \to 2}. \tag{21}$$

(Indeed, for a large random matrix with standard Gaussian entries, the upper bound in (21) behaves as $m + \sqrt{mk}$
whereas that of (20) behaves as $m\sqrt{k}$. Both of these upper bounds are easily seen to be no larger than $\sqrt{m}\,\|A\|_{\mathrm{Fr}}$, however.)

With this upper bound, we minimize $\|\Psi H\|_{2 \to 2}$ over all alternative duals $\Psi$ of $\Phi$. Then an explicit solution is
available and is given by

$$\Psi_{H^{-1}} := (H^{-1}\Phi)^\dagger H^{-1}. \tag{22}$$

This idea was initially introduced specifically for $\Sigma\Delta$ quantization [?] with the choice $H = D^r$. The resulting
alternative duals were called Sobolev duals and will be discussed in the next subsection. The above generalized
version was stated in [?] where the notation $\Psi_H$ and the term “H-dual” were introduced for the right hand side of
(22), but because of a further generalization we will discuss in Section 3.3.3 we find it more appropriate to use the
label $H^{-1}$.

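Note that the no noise-shaping case of $H = I$ yields the canonical dual. In general, we have

$$\|\Psi_{H^{-1}} H\|_{2 \to 2} = \|(H^{-1}\Phi)^\dagger\|_{2 \to 2} = \frac{1}{\sigma_{\min}(H^{-1}\Phi)}$$

so that (19) and (21) yield the error bound

$$\|x - \Psi_{H^{-1}} q\|_2 \le \frac{\sqrt{m}}{\sigma_{\min}(H^{-1}\Phi)} \|u\|_\infty. \tag{23}$$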
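
The alternative dual $(H^{-1}\Phi)^\dagger H^{-1}$ and the identity for its operator norm are easy to verify numerically. The following sketch assumes $H = D^r$ with $r = 2$ and, purely as a test case, the roots-of-unity frame in $\mathbb{R}^2$; the helper names `h_dual` and `spectral_norm` are hypothetical.

```python
import numpy as np


def h_dual(Phi, H):
    """Alternative dual (H^{-1} Phi)^dagger H^{-1} of the frame Phi."""
    H_inv = np.linalg.inv(H)
    return np.linalg.pinv(H_inv @ Phi) @ H_inv


def spectral_norm(A):
    """Operator norm from l2 to l2 (largest singular value)."""
    return np.linalg.norm(A, 2)


m, k, r = 60, 2, 2
angles = 2 * np.pi * np.arange(m) / m
Phi = np.column_stack([np.cos(angles), np.sin(angles)])  # test frame (assumption)
D = np.eye(m) - np.eye(m, k=-1)           # lower bidiagonal: 1 on diagonal, -1 below
H = np.linalg.matrix_power(D, r)          # r-th order Sigma-Delta transfer operator

Psi_canonical = np.linalg.pinv(Phi)       # the case H = I
Psi_alternative = h_dual(Phi, H)

sigma_min = np.linalg.svd(np.linalg.inv(H) @ Phi, compute_uv=False).min()
norm_canonical = spectral_norm(Psi_canonical @ H)
norm_alternative = spectral_norm(Psi_alternative @ H)   # equals 1 / sigma_min
error_constant = np.sqrt(m) / sigma_min   # factor multiplying ||u||_inf in the bound
```

Both matrices are left inverses of `Phi`, but `norm_alternative` is never larger than `norm_canonical`, which is the sense in which the alternative dual is matched to the quantizer.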


3.3.1 Sobolev Duals 

In the case of $\Sigma\Delta$ modulation, $H$ is defined by (6) and given in matrix form by $D^r$ where the diagonal entries of the
lower bidiagonal matrix $D$ are 1 and the subdiagonal entries are $-1$. Because $\|\Psi D^r\|_{2 \to 2}$ resembles a Sobolev norm
on $\Psi$, the corresponding alternative dual was called the ($r$th order) Sobolev dual of $\Phi$ in [4]. In this work, Sobolev
duals of certain deterministic frames, such as the harmonic frames, were studied. More precisely, [4] considered
frames obtained using a sufficiently dense sampling of vector-valued functions on $[0, 1]$, which had the additional
property that their component functions were piecewise $C^1$ and linearly independent. For such frames, it was shown
that

$$\sigma_{\min}(D^{-r}\Phi) \ge c_r m^{r + \frac{1}{2}}, \tag{24}$$

hence with (23), the reconstruction error using the $r$th order Sobolev dual satisfies

$$\|x - \Psi_{D^{-r}} q\|_2 \le \frac{C_r}{m^r} \|u\|_\infty \tag{25}$$

with $C_r := 1/c_r$. Here, for a fixed stable $\Sigma\Delta$ scheme, the constant $C_r$ depends only on the order $r$ and the vector-valued
function from which the frame was sampled. The main technique used in [4] to control the operator norm
$\|\Psi_{D^{-r}} D^r\|_{2 \to 2}$ is a Riemann sum argument. The argument leverages the smoothness of the vector-valued functions
from which the frames are sampled to obtain a lower bound on $\|D^{-r}\Phi x\|_2$ over unit norm vectors $x \in \mathbb{R}^k$ and produces
the stated lower bound (24).

As mentioned before, error bounds similar to (25) had also been obtained in [6], albeit for specific tight frames.
Nevertheless, in both [4] and [6], the decay of the error associated with $\Sigma\Delta$ quantization is a polynomial function of
the number of measurements. The significance of this polynomial error decay stems from the fact that for any frame,
a lower bound on the reconstruction error associated with MSQ is known to decay only linearly in $m$ [?, 20].

3.3.2 Refined Bounds Using Sobolev Duals 

The analysis of [4] was refined in [28] in two special cases: harmonic frames, and the so-called Sobolev self-dual
frames. For these frames, [28] established an upper bound on the reconstruction error that decays as a root-exponential
function of the number of measurements. More specifically, for harmonic frames, [28] explicitly bounds the constant
$C_r$ in (25) and, as in [22] and [15], optimizes the $\Sigma\Delta$ scheme’s order $r$ as a function of the number of measurements.
Quantizing with a $\Sigma\Delta$ scheme of the optimal order $r_{\mathrm{opt}}(m)$ and reconstructing with the associated Sobolev dual results
in a root-exponential error bound

$$\|x - \Psi_{D^{-r_{\mathrm{opt}}}} q\|_2 \le c_1 e^{-c_2 \sqrt{m/k}} \tag{26}$$

where the constants $c_1$ and $c_2$ depend on the quantization alphabet $\mathcal{A}_{L,\delta}$ and possibly on $k$ as well. This possible
dependence on $k$ is absent in the similar bound for Sobolev self-dual frames. Sobolev self-dual frames are defined
using the singular value decomposition $D^r = U \Sigma V^*$. Here, the $m \times k$ matrix corresponding to a Sobolev self-dual
frame consists of the $k$ columns of $U$ associated with the smallest singular values of $D^r$. This construction implies
that the frame admits itself as both a canonical dual and Sobolev dual of order $r$, hence the name. More importantly,
this construction also allows one to bound $C_r$ in (25) explicitly and optimize the $\Sigma\Delta$ scheme’s order $r$ to obtain the
error bound (26), without any dependence of the constants on $k$.

While we have so far discussed deterministic constructions of frames, Gaussian random frames were studied in
[?], and later, sub-Gaussian random frames in [29]. We will discuss these random frames extensively in Section
4.1, though at this point we note that, like the harmonic and Sobolev self-dual frames, these frames also allow for
root-exponential error decay when the order of the $\Sigma\Delta$ scheme is optimized.
In the context of $\Sigma\Delta$ quantization of frame coefficients using a fixed alphabet $\mathcal{A}$, the number of measurements is
proportional to the total number of bits. Hence, the error bounds (25) and (26) can be interpreted as polynomially
and root-exponentially decaying in the total number of bits. While these bounds are certainly a big improvement over
the linearly decaying lower bound associated with MSQ, they are still sub-optimal. To see this, one observes that
the problem of quantizing vectors in the unit ball of $\mathbb{R}^k$ with a maximum reconstruction error of $\varepsilon$ is analogous to
covering the unit ball with balls of radius $\varepsilon$. A simple volume argument shows that to quantize the unit ball of $\mathbb{R}^k$
with an error of $\varepsilon$, one needs at least $k \log_2(1/\varepsilon)$ bits. Thus, the reconstruction error can at best decay exponentially
in the number of bits used. Moreover, since there exists a covering of the unit ball with no more than $(3/\varepsilon)^k$ elements
(see, e.g., [32]), in principle an exponential decay in the error as a function of the number of bits used is possible.
This exponential error decay is predicated on a quantization scheme that has direct access to $x$ and, more importantly,
the ability to compare $x$ to each of the approximately $\varepsilon^{-k}$ elements of the covering, to assign it an appropriate binary
label. The reconstruction scheme for this quantization would then simply replace the binary label by the center of
the element of the covering associated with it. Of course, this setting is markedly different from the noise-shaping
quantization of frame coefficients considered in this chapter, but it establishes exponential error decay in the number
of bits as optimal.
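
For completeness, the volume argument can be written out in two lines. The display below is a sketch; the symbols $N$ (number of codewords), $R$ (number of bits) and $B_1$ (the unit ball of $\mathbb{R}^k$) are introduced here, and the covering constant 3 in the second line is the standard one and should be read as an assumption about the bound being cited.

```latex
% Lower bound: N balls of radius \varepsilon must cover the unit ball B_1 of R^k.
\[
N \,\varepsilon^{k}\, \mathrm{vol}(B_1) \;\ge\; \mathrm{vol}(B_1)
\;\Longrightarrow\;
N \ge \varepsilon^{-k}
\;\Longrightarrow\;
R := \log_2 N \;\ge\; k \log_2(1/\varepsilon).
\]
% Upper bound: with a budget of R bits, pick \varepsilon so that a covering
% with at most (3/\varepsilon)^k elements can be indexed by R bits.
\[
\varepsilon = 3 \cdot 2^{-R/k}
\;\Longrightarrow\;
(3/\varepsilon)^{k} = 2^{R}.
\]
```

Together these show that the smallest achievable error behaves like $2^{-R/k}$ up to a constant factor, i.e., it decays exponentially in the number of bits $R$ for fixed $k$.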

To achieve exponential error decay in the number of bits, [26] proposed an encoding scheme to follow $r$th order
$\Sigma\Delta$ quantization. The encoding scheme consists of using an $\ell \times m$ Bernoulli random matrix $B$, with $\ell$ slightly larger
than $k$, to embed the vector $D^{-r}q$ into a lower dimensional subspace. Since $B$ serves as a distance-preserving
Johnson-Lindenstrauss embedding (see [?]), the vector $BD^{-r}q$ effectively contains all the information needed for accurate
reconstruction of $x$, and it is the only quantity retained. Moreover, the number of bits required to store $BD^{-r}q$
scales only logarithmically in $m$. Using $(BD^{-r}\Phi)^\dagger$ as a reconstruction operator (acting on $BD^{-r}q$) and employing the
properties of Johnson-Lindenstrauss embeddings, [26] shows that the reconstruction error still decays as it would have
if no embedding had been employed. In particular, this means an error decay of $m^{-r}$ for the frames discussed in this
section. Combining these two observations, i.e., logarithmic scaling of the number of bits with $m$, and polynomial
decay of the error, [26] obtains reconstruction error bounds that decay exponentially, i.e., near optimally, in the
number of bits.

It turns out that exponential decay of the reconstruction error (in the bit rate or in the oversampling ratio $m/k$) can
also be achieved by means of the “plain route” of noise-shaping quantization and alternative dual reconstruction only,
but with noise-shaping unlike $\Sigma\Delta$ quantization and more like the conventional beta encoding [10, 11]. This method,
called beta duals, is explained next for general frames, and later in Section 4.2 for random frames.


3.3.3 Further generalizations: V-duals

Given any $m \times k$ matrix $\Phi$ whose rows are a frame for $\mathbb{R}^k$, consider any $p \times m$ matrix $V$ (i.e., not necessarily square)
such that $V\Phi$ is also a frame for $\mathbb{R}^k$. We will call

$$\Psi_V := (V\Phi)^\dagger V \tag{27}$$

the V-dual of $\Phi$. (The square and invertible case of $V = H^{-1}$ was already discussed at the beginning of this
subsection.) When $p < m$, we call $V\Phi$ the V-condensation of $\Phi$.

With a V-dual, we have $\Psi_V H = (V\Phi)^\dagger V H$ so that

$$\|\Psi_V H\|_{\infty \to 2} \le \frac{\|VH\|_{\infty \to 2}}{\sigma_{\min}(V\Phi)} \le \frac{\sqrt{p}\, \|VH\|_{\infty \to \infty}}{\sigma_{\min}(V\Phi)}. \tag{28}$$

For $V = H^{-1}$ (and therefore, $p = m$), combination of (19) with (28) agrees with (23). However, as shown in [?],
optimization of (28) over $V$ can produce a strictly smaller reconstruction error upper bound. A highly effective special
case is discussed next.


Beta duals. Beta duals have been recently proposed and studied in [10, 11]. They constitute a special case of
$V$-duals, while they relate strongly to classical beta expansions. (See OH G2 for the classical theory of beta expansions,
and CD for the use of beta expansions in A/D conversion as a robust alternative to successive approximation.) In
order to illustrate the main construction of beta duals without technical details, our presentation in this article will be
restricted to certain dimensional constraints as described below.

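Let $m \ge p \ge k$ and assume that $\lambda' := m/p$ is an integer. For any $\beta > 1$, let $h^\beta$ be the (length-2) sequence given
by $h^\beta_0 = 1$ and $h^\beta_1 = -\beta$. Define $H^\beta$ to be the $\lambda' \times \lambda'$ noise-shaping transfer operator corresponding to $h^\beta$, and

$$v^\beta := [\,\beta^{-1}\ \ \beta^{-2}\ \ \cdots\ \ \beta^{-\lambda'}\,].$$

We set

$$H := \begin{bmatrix} H^\beta & & \\ & \ddots & \\ & & H^\beta \end{bmatrix}_{m \times m} \quad\text{and}\quad V := \begin{bmatrix} v^\beta & & \\ & \ddots & \\ & & v^\beta \end{bmatrix}_{p \times m}. \tag{29}$$

In other words, $H = I_p \otimes H^\beta$ and $V = I_p \otimes v^\beta$ where $\otimes$ denotes the Kronecker product. It follows that
$VH = I_p \otimes (v^\beta H^\beta)$. Since $v^\beta H^\beta = [\,0\ \cdots\ 0\ \ \beta^{-\lambda'}\,]$, we have $\|VH\|_{\infty\to\infty} = \beta^{-\lambda'} = \beta^{-m/p}$ which, together with the
error bound $\|x - \Psi_V q\|_2 \le \|\Psi_V H\|_{\infty\to 2}\|u\|_\infty$ and (28), yields

$$\|x - \Psi_V q\|_2 \le \frac{\sqrt{p}\,\|u\|_\infty}{\sigma_{\min}(V\Phi)}\,\beta^{-m/p}. \tag{30}$$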


For certain special frames, such as the harmonic semi-circle frames, it is possible to set p as low as k and turn the 
above bound into a near-optimal one in terms of its bit-rate m The case of random frames will be discussed in the 
next section. 
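
The identity $v^\beta H^\beta = [\,0\ \cdots\ 0\ \ \beta^{-\lambda'}\,]$ behind the bound (30) is easy to check numerically. The following minimal Python sketch builds one diagonal block, under the assumption that the noise-shaping transfer operator of $h^\beta$ is the lower-triangular Toeplitz matrix with entries $H^\beta_{ij} = h^\beta_{i-j}$. The helper names `beta_block` and `row_times_matrix` and the values $\beta = 1.5$, $\lambda' = 6$ are illustrative choices, not part of the original construction.

```python
def beta_block(beta, lam):
    """Return (H_beta, v_beta) for one block of length lam (lam plays the role of m/p).

    Assumption: H_beta is the lower-triangular Toeplitz matrix of h = (1, -beta),
    i.e. ones on the diagonal and -beta on the first subdiagonal.
    """
    H = [[0.0] * lam for _ in range(lam)]
    for i in range(lam):
        H[i][i] = 1.0
        if i >= 1:
            H[i][i - 1] = -beta
    # v_beta = [beta^-1, beta^-2, ..., beta^-lam]
    v = [beta ** -(j + 1) for j in range(lam)]
    return H, v


def row_times_matrix(v, H):
    """Compute the row vector v * H."""
    n = len(v)
    return [sum(v[i] * H[i][j] for i in range(n)) for j in range(len(H[0]))]


beta, lam = 1.5, 6
H_beta, v_beta = beta_block(beta, lam)
vH = row_times_matrix(v_beta, H_beta)

# The infinity-to-infinity norm of a single row is the sum of absolute values.
norm_inf = sum(abs(t) for t in vH)
print(vH)
print(norm_inf, beta ** -lam)
```

All entries of `vH` except the last cancel (up to floating-point rounding), because column $j$ of the product is $\beta^{-j} - \beta\cdot\beta^{-(j+1)}$; the last entry is $\beta^{-\lambda'}$, which is also the $\infty\to\infty$ norm of each row of $VH$.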

In Fig. 2 we illustrate a beta dual of a certain “roots-of-unity” frame along with the Sobolev duals of order 0 (the
canonical dual), 1, and 2.

Figure 2: Comparative illustration of the various alternative duals described in this paper: Each plot depicts the original
frame in $\mathbb{R}^2$ consisting of the 15th roots-of-unity along with one of its duals (scaled up by a factor of two for visual
clarity). For the computation of the alternative duals, the analysis frame was ordered counter-clockwise starting from
$(1, 0)$.


4 Analysis of Alternative Duals for Random Frames 

In this section, we consider random frames, that is, frames whose analysis (or synthesis) operator is a random matrix. 
Certain classes of random matrices have become of considerable importance in high dimensional signal processing, 
particularly with the advent of compressed sensing. One main reason for this is that their inherent independence 
entails good conditioning of not only the matrix, but also its submatrices. Because of the fast growing number of 
such submatrices with dimension, the latter is very difficult to achieve with deterministic constructions. This also 
means, however, that any two frame vectors are approximately orthogonal, so frame path conditions that would imply 
recovery guarantees using canonical dual frames will almost never hold. For this reason, it is crucial to work with
alternative duals. We separately consider the two main examples discussed above, Sobolev duals and beta duals. 

4.1 Sobolev duals of random frames 

As noted above, the Sobolev dual of a frame is the dual frame $\Psi$ that minimizes the expression $\|\Psi D^r\|_{2\to 2}$, and the
explicit minimizer is given by (22) with $H = D^r$. By (23), a bound for the error that arises when using this alternative
dual to reconstruct is governed by $\sigma_{\min}(D^{-r}\Phi)$. Thus a main goal of this subsection is to discuss the behavior of this
minimum singular value.

The matrix $D^{-r}\Phi$ is the product of a deterministic matrix $D^{-r}$, whose singular values are known to a sufficient
approximation, and a random matrix $\Phi$ whose singular values are known to be well concentrated. Nevertheless,
using a product bound does not yield good results, mainly because the singular values of $D^{-r}$ differ tremendously,
so any worst case bound will not be good enough. One approach to provide a refined bound is to first provide lower
bounds for the action of $D^{-r}\Phi$ on a single vector and then proceed via a covering argument. That is, one combines
these lower bounds for all of the vectors forming an $\varepsilon$-net, obtaining a uniform bound for the net. An approximation
argument then allows one to pass from the net to all vectors in the sphere. In this way, [21] obtains the following result
for Gaussian random frames:

Theorem 4.1 ([21]). Let $\Phi$ be an $m \times k$ random matrix whose entries are i.i.d. standard Gaussian variables. Given
$r \in \mathbb{N}$ and $\alpha \in (0,1)$, there exist strictly positive $r$-dependent constants $c_1$, $c_2$, and $c_3$ such that if
$\lambda := m/k \ge (c_1 \log m)^{1/(1-\alpha)}$, then with probability at least $1 - \exp(-c_2 m \lambda^{-\alpha})$,

$$\sigma_{\min}(D^{-r}\Phi) \ge c_3(r)\,\lambda^{\alpha(r - 1/2)}\sqrt{m}. \tag{31}$$

In this approach, one explicitly uses the density of the Gaussian distribution. Thus, as soon as the matrix entries
fail to be exactly Gaussian, a completely different approach is needed. In what follows, we will present the main
idea of the method used in [29] to tackle the case of random matrices with independent sub-Gaussian entries as
introduced in the following definition (for alternative characterizations of sub-Gaussian random variables see, for
example, [41]). This approach is also related to the RIP-based analysis for quantized compressive sampling presented
in [18] (cf. Section 5 below).

Definition 4.1. A random variable $\xi$ is sub-Gaussian with parameter $c > 0$ if it satisfies $\mathbb{P}(|\xi| > t) \le e^{1 - ct^2}$ for all
$t \ge 0$.

As in the Gaussian case presented in [21], we employ the singular value decomposition $D^{-r} = U\Sigma V^*$ where $U$
and $V$ are unitary and $\Sigma \in \mathbb{R}^{m \times m}$ is a diagonal matrix with entries $s_1 \ge \cdots \ge s_m \ge 0$. Then

$$\sigma_{\min}(D^{-r}\Phi) = \sigma_{\min}(U\Sigma V^*\Phi) = \sigma_{\min}(\Sigma V^*\Phi),$$

as $U$ is unitary. Furthermore, for $P_\ell : \mathbb{R}^m \to \mathbb{R}^\ell$ the projection onto the first $\ell$ entries, $\ell \le m$, one has in the positive
semidefinite partial ordering $\succeq$

$$\Sigma \succeq P_\ell^* P_\ell \Sigma = P_\ell^* P_\ell \Sigma P_\ell^* P_\ell \succeq s_\ell P_\ell^* P_\ell.$$

Here the first inequality uses that $P_\ell^* P_\ell$ is a projection, the following equality uses that $\Sigma$ is diagonal, and the last
inequality uses that the diagonal entries of $\Sigma$ are ordered. Since all matrices involved are diagonal, the same chain
holds with $\Sigma$ replaced by $\Sigma^2$ and $s_\ell$ by $s_\ell^2$.

As a consequence, we find that $\sigma_{\min}(D^{-r}\Phi) \ge s_\ell\,\sigma_{\min}(P_\ell V^*\Phi)$. For Gaussian matrix entries, this immediately yields
Theorem 4.1 as standard Gaussian vectors are rotation invariant, so $P_\ell V^*\Phi$ is just a standard Gaussian matrix, whose
singular value distributions are well understood (see for example [41]). Applying the bound for different values of $\ell$
yields the theorem for different choices of $\alpha$.

For independent, zero mean, unit variance sub-Gaussian (rather than Gaussian) matrix entries, one no longer has
such a strong version of rotation invariance; while the columns of $V^*\Phi$ will still be sub-Gaussian random vectors, its
entries will, in general, no longer be independent. There are also singular value estimates that require only independent
sub-Gaussian matrix columns rather than independent entries (see again [41]), but such bounds require that the
matrix columns are of constant norm. Even if $\Phi$ and hence also $V^*\Phi$ has constant norm columns (such as for example
for Bernoulli matrices, $\Phi_{ij} \in \{\pm 1\}$), the projection $P_\ell$ will typically map them to vectors of different length.

In order to nevertheless bound the singular values, we again use a union bound argument, first considering the
action on one fixed vector $x$ of unit norm. Then we write

$$\|P_\ell V^*\Phi x\|_2^2 = \sum_{i,i'=1}^{k}\sum_{j,j'=1}^{m} x_i \Phi_{ji}\,(VP_\ell^*P_\ell V^*)_{jj'}\,\Phi_{j'i'} x_{i'}.$$

Thus $\|P_\ell V^*\Phi x\|_2^2$ is a so-called chaos process, that is, a random quadratic form of the form $\langle \xi, M\xi\rangle$ where $\xi$ is a
random vector with independent entries (in this case, the vectorization of $\Phi$). Its expectation is given by

$$\mathbb{E}\|P_\ell V^*\Phi x\|_2^2 = \sum_{i=1}^{k}\sum_{j=1}^{m} x_i^2\,(VP_\ell^*P_\ell V^*)_{jj} = \|x\|_2^2 \operatorname{tr}(VP_\ell^*P_\ell V^*) = \ell,$$

where the last equality uses the cyclicity of the trace. Its deviation from the expectation can be estimated using the
following refined version of the Hanson-Wright inequality, which has been provided in 03 (see [24] for the original
version).

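Theorem 4.2. Let $\xi = (\xi_1, \ldots, \xi_n) \in \mathbb{R}^n$ be a random vector with independent components which are sub-Gaussian
with parameter $c$ and satisfy $\mathbb{E}\xi_i = 0$. Let $M$ be an $n \times n$ matrix. Then for every $t \ge 0$,

$$\mathbb{P}\left\{\,|\langle \xi, M\xi\rangle - \mathbb{E}\langle \xi, M\xi\rangle| > t\,\right\} \le 2\exp\left(-C_4 \min\left(\frac{t^2}{\|M\|_F^2},\ \frac{t}{\|M\|_{2\to 2}}\right)\right),$$

where $C_4$ is an absolute constant.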

To obtain a deviation bound for the above setup, we thus need to estimate the Frobenius norm
$\|M\|_F^2 := \operatorname{tr} M^*M = \sum_{(i,j),(i',j')} M_{(i,j),(i',j')}^2$ and the operator norm $\|M\|_{2\to 2} := \sup_{\|y\|_2 = 1}\|My\|_2$ of the doubly-indexed matrix $M$ given by
$M_{(i,j),(i',j')} = x_i x_{i'}\,(VP_\ell^*P_\ell V^*)_{jj'}$. For the Frobenius norm, we write

$$\|M\|_F^2 = \sum_{i,i'=1}^{k}\sum_{j,j'=1}^{m} x_i^2 x_{i'}^2\,(VP_\ell^*P_\ell V^*)_{jj'}^2 = \|VP_\ell^*P_\ell V^*\|_F^2 = \operatorname{tr}(VP_\ell^*P_\ell V^*VP_\ell^*P_\ell V^*) = \ell,$$

where in the last equality, we used again the cyclicity of the trace, that $V$ is unitary, and that $P_\ell^*P_\ell$ is a projection. For
the operator norm, we note that


$$M = X^*\,VP_\ell^*P_\ell V^*\,X, \qquad X := \begin{pmatrix} x^T & 0 & \cdots & 0 \\ 0 & x^T & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & x^T \end{pmatrix},$$

that is, $M = A^*A$ with $A := P_\ell V^* X$. As all three factors of $A$ have operator norm 1, the norm of their product is
bounded above by 1, and hence so is $\|M\|_{2\to 2} = \|A\|_{2\to 2}^2$. On the other
hand, applying $A$ to the unit norm vector $y$ given by $y_{(i,j)} = x_i V_{j1}$ yields $Ay = e_1$, where $e_1$ is the first standard basis
vector, showing that the norm is also lower bounded by 1. So one indeed has $\|M\|_{2\to 2} = 1$. Combining these bounds
with Theorem 4.2 yields the following generalization of Theorem 4.1 for sub-Gaussian frames.

Theorem 4.3 ([29]). Let $\Phi$ be an $m \times k$ random matrix whose entries are zero mean, unit variance, sub-Gaussian
random variables with parameter $c$. Given $r \in \mathbb{N}$ and $\alpha \in (0,1)$, there exist constants $c = c(r) > 0$ and $c' = c'(r) > 0$
such that if $\lambda := \frac{m}{k} \ge c^{\frac{1}{1-\alpha}}$, then one has with probability at least $1 - \exp(-c' m \lambda^{-\alpha})$

$$\sigma_{\min}(D^{-r}\Phi) \ge \lambda^{\alpha(r - 1/2)}\sqrt{m}. \tag{32}$$


Combining (23) for $H = D^r$ with the lower bound of (31) or (32), the Sobolev dual reconstruction $\Psi_{D^{-r}}q$ from
$\Sigma\Delta$ quantized frame coefficients $y = \Phi x$ results in the error bound

$$\|x - \Psi_{D^{-r}}q\|_2 \le C(r)\,\lambda^{-\alpha(r - 1/2)}\,\delta. \tag{33}$$


Thus the error decays polynomially in the oversampling rate $\lambda$ as long as the underlying $\Sigma\Delta$ scheme is stable.
For the greedy quantization rule, stability follows from Proposition 3.1 as long as $\|y\|_\infty \le \mu$ for a suitable $\mu$ whose
range is constrained by the quantization alphabet $\mathcal{A}_L^\delta$ and $r$. (It can be easily computed that for $H = D^r$, we have
$\|H - I\|_{\infty\to\infty} = 2^r - 1$. Hence we require $L > 2^r - 1$, with the value of $\delta$ assumed to be adjustable.) If we assume
that $\|x\|_2 \le 1$, then controlling $\|y\|_\infty$ amounts to bounding $\|\Phi\|_{2\to\infty} \le \|\Phi\|_{2\to 2}$ and thus to bounding the maximum
singular value of a rectangular matrix with independent sub-Gaussian entries. This is a well-understood setup; it is
known that the singular values of such a matrix are well concentrated and one has $\|\Phi\|_{2\to\infty} \le \|\Phi\|_{2\to 2} = O(\sqrt{m})$
with high probability (see again [41]). As a consequence, the $\Sigma\Delta$ scheme is stable provided $L$ is chosen large enough
and the quantizer level is adjusted accordingly. We conclude that sub-Gaussian frame expansions quantized using a
greedy $r$-th order $\Sigma\Delta$ scheme allow for reconstruction error bounds decaying polynomially in the oversampling rate,
where the decay order can be made arbitrarily large by choosing $r$ large enough.
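
To make the stability requirement concrete, here is a hedged Python sketch of an $r$th order greedy $\Sigma\Delta$ quantizer. It assumes the relation $y - q = D^r u$ with $u_n = 0$ before the first sample, and it assumes that $\mathcal{A}_L^\delta$ is the $L$-level arithmetic progression of spacing $2\delta$ that is symmetric about the origin, so that rounding any $w$ with $|w| \le L\delta$ incurs an error of at most $\delta$. Both are assumptions about material defined in Section 3, and the function name `sigma_delta_greedy` as well as the parameter values are illustrative.

```python
import math
import random


def sigma_delta_greedy(y, r, L, delta):
    """Greedy r-th order Sigma-Delta quantization (sketch under the stated assumptions).

    Returns (q, u) with y[n] - q[n] = sum_j h[j] * u[n - j], where h holds the
    coefficients of (1 - z)^r, i.e. y - q = D^r u.
    """
    h = [(-1) ** j * math.comb(r, j) for j in range(r + 1)]  # h[0] = 1
    # Assumed alphabet: L levels with spacing 2*delta, symmetric about 0.
    levels = [(-(L - 1) + 2 * i) * delta for i in range(L)]
    q, u = [], []
    for n, yn in enumerate(y):
        # w_n = y_n - sum_{j >= 1} h_j u_{n-j}; state entries before time 0 are zero.
        w = yn - sum(h[j] * u[n - j] for j in range(1, r + 1) if n - j >= 0)
        qn = min(levels, key=lambda a: abs(w - a))  # greedy rule: nearest level
        q.append(qn)
        u.append(w - qn)
    return q, u


# Parameters chosen so that 2^r - 1 + mu/delta <= L (3 + 5 <= 8).
r, L, delta, mu = 2, 8, 0.1, 0.5
random.seed(0)
y = [random.uniform(-mu, mu) for _ in range(200)]
q, u = sigma_delta_greedy(y, r, L, delta)
print(max(abs(t) for t in u))
```

Under these assumptions an induction shows $|w_n| \le \mu + (2^r - 1)\delta \le L\delta$ at every step, so the state variable stays bounded by $\delta$, which is exactly what the error bound (33) requires.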
4.2 Beta duals of random frames


We return to the Gaussian distribution for the analysis of beta duals for random frames. Based on the error bound
(30) derived in Section 3.3.3, it now suffices to give a probabilistic lower bound for $\sigma_{\min}(V\Phi)$. Note that the entries
of the $p \times k$ matrix $V\Phi$ are i.i.d. Gaussian with variance

$$\sigma^2 := \beta^{-2} + \cdots + \beta^{-2\lambda'}. \tag{34}$$

At this point, a choice for the parameter $p$ needs to be made. In 033, both choices of $p = k$ and $p > k$ were studied
in detail. The analysis of the former choice is somewhat cleaner, but the strongest probabilistic estimates follow by
choosing $p$ greater than $k$.

We will primarily be interested in the probability that the smallest singular value of $V\Phi$ is near zero. For $p = k$, the
following well-known result suffices:

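Theorem 4.4 ([36, Theorem 3.1], I' 13). Let $\Omega$ be a $k \times k$ random matrix with entries drawn independently from
$\mathcal{N}(0, \sigma^2)$. Then for any $\varepsilon > 0$,

$$\mathbb{P}\left(\{\sigma_{\min}(\Omega) \le \varepsilon\sigma/\sqrt{k}\}\right) \le \varepsilon.$$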

Meanwhile, the stability of the greedy quantizer with alphabet $\mathcal{A}_L^\delta$ can be ensured in a way similar to the case
of Sobolev duals, noting that $\|H - I\|_{\infty\to\infty} = \beta$. Hence, we know that if $\beta + \mu/\delta \le L$, then $\|u\|_\infty \le \delta$. By standard
Gaussian concentration results, $\mu \le 4\sqrt{m}$ is guaranteed with probability at least $1 - e^{-2m}$. Therefore, with (30) and
Theorem 4.4 in which we set $\Omega = V\Phi$, we obtain

$$\|x - \Psi_V q\|_2 \le kL\varepsilon^{-1}\delta\beta^{-m/k} \tag{35}$$

with probability at least $1 - \varepsilon - e^{-2m}$, where we have also used the simple chain of inequalities $1/\sigma \le \beta \le L$. The
value of $\beta$ can be chosen arbitrarily close to $L$ with sufficiently large values of $\delta$. However, the optimal choice would
result from minimizing $\delta\beta^{-m/k}$ subject to $\beta + \mu/\delta = L$. For details, see (IT).
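
The constrained minimization just mentioned can be explored numerically. The sketch below is an illustration under stated assumptions, not the analysis of the cited reference: it eliminates $\delta = \mu/(L - \beta)$ using the constraint, so that the objective becomes $\mu\,\beta^{-m/k}/(L - \beta)$ on $1 < \beta < L$, and compares a grid search with the stationary point $\beta = L\lambda/(\lambda + 1)$, $\lambda = m/k$, obtained by setting the derivative of the logarithm of this objective to zero. The function names and the values $\lambda = 8$, $L = 4$, $\mu = 1$ are hypothetical.

```python
def objective(beta, lam, L, mu):
    """delta * beta^(-lam) with delta eliminated via beta + mu/delta = L."""
    delta = mu / (L - beta)
    return delta * beta ** (-lam)


def best_beta_grid(lam, L, mu, steps=100000):
    """Grid search for the minimizer over the open interval (1, L)."""
    best_beta, best_val = None, None
    for i in range(1, steps):
        beta = 1.0 + (L - 1.0) * i / steps
        val = objective(beta, lam, L, mu)
        if best_val is None or val < best_val:
            best_beta, best_val = beta, val
    return best_beta, best_val


lam, L, mu = 8.0, 4.0, 1.0
beta_grid, val_grid = best_beta_grid(lam, L, mu)
# Stationary point of -lam*log(beta) - log(L - beta); valid when it exceeds 1.
beta_closed = lam * L / (lam + 1.0)
print(beta_grid, beta_closed)
```

Since the logarithm of the objective is convex on $(1, L)$, the stationary point is the unique minimizer whenever it lies in that interval; for large $\lambda$ it approaches $L$, in line with the remark that $\beta$ can be taken close to $L$.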

For p> k, we have the following result: 

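Theorem 4.5 m Theorem 4.3]). Let $p > k$ and $\Omega$ be a $p \times k$ random matrix whose entries are drawn independently
from $\mathcal{N}(0, \sigma^2)$. Then for any $0 < \varepsilon < 1$,

$$\mathbb{P}\left(\{\sigma_{\min}(\Omega) \le \varepsilon\sigma\sqrt{p}/2\}\right) \le \left(10 + 8\sqrt{\log \varepsilon^{-1}}\right)^{k} e^{p/2}\,\varepsilon^{p-k+1}.$$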


The corresponding error bound

$$\|x - \Psi_V q\|_2 \le 2L\varepsilon^{-1}\delta\beta^{-m/p} \tag{36}$$

now holds with higher probability. The choices $\varepsilon \approx \beta^{-\eta m/p}$ for small $\eta$ and $p \approx (1 + \eta)k$ turn out to be good ones.
For details, again see cm


5 Noise-shaping Quantization for Compressive Sampling 

Compressive sampling (also called compressed sensing) has emerged over the last decade as a novel sampling
paradigm. It is based on the empirical observation that various important classes of signals encountered in practice,
such as audio and images, admit (nearly) sparse approximations when expanded with respect to an appropriate
basis or frame, such as a wavelet basis or a Gabor frame. Seminal papers by Candès, Romberg, and Tao U, and
by Donoho ED established the fundamental theory, specifying how to collect the samples (or measurements), and
the relation between the approximation accuracy and the number of samples acquired (“sampling rate”) vis-a-vis the
sparsity level of the signal. Since then the literature has matured considerably, again focusing on the same issues, i.e.,
how to construct effective measurement schemes and how one can control the approximation error as a function of
the sampling rate, e.g., see [19].

By now compressive sampling is well-established as an effective sampling theory. From the perspective of practicability,
however, it also needs to be accompanied by a quantization theory. Here, as in the case of frames, MSQ
is highly limited as a quantization strategy in terms of its rate-distortion performance. Thus, efficient quantization
methods are needed for compressive sampling to live up to its name, i.e., to provide compressed representations in
the sense of source coding.

In this section, we will discuss how noise-shaping methods can be employed to quantize compressive samples 
of sparse and compressible signals to vastly improve the reconstruction accuracy compared to the default method of 
MSQ. We start with the basic framework of compressive sampling as needed for our discussion. 
5.1 Basics of Compressive Sampling

In the basic theory of compressive sampling, the signals of interest are finite (but potentially high) dimensional
vectors that are exactly or approximately sparse. More precisely, we say that a signal $x$ in $\mathbb{R}^N$ is $k$-sparse if it is in
$\Sigma_k^N := \{x \in \mathbb{R}^N : \|x\|_0 \le k\}$. Here $\|x\|_0$ denotes the number of non-zero entries of $x$. The signals we encounter in
practice are typically not sparse, but they can be well-approximated by sparse signals. Such signals are referred to as
compressible signals and roughly identified as signals $x$ with small $\sigma_k(x)_{\ell_p}$, the best $k$-term approximation error of $x$
in $\ell_p$, defined by

$$\sigma_k(x)_{\ell_p} := \min_{z \in \Sigma_k^N} \|x - z\|_{\ell_p}.$$
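
As a small illustration of this definition, the following Python sketch computes $\sigma_k(x)_{\ell_p}$ by using the standard fact that a best $k$-term approximation keeps the $k$ largest-in-magnitude entries of $x$, so the error is the $\ell_p$ norm of the remaining entries. The function name `best_k_term_error` and the example vector are illustrative.

```python
def best_k_term_error(x, k, p=1.0):
    """Best k-term approximation error of x in l_p (p >= 1 here).

    The minimizer keeps the k largest-in-magnitude entries, so the error is
    the l_p norm of the remaining (smallest) entries.
    """
    mags = sorted((abs(t) for t in x), reverse=True)
    tail = mags[k:]
    return sum(t ** p for t in tail) ** (1.0 / p)


x = [3.0, -1.0, 0.5, 0.0, 2.0]
err_l1 = best_k_term_error(x, 2, p=1.0)  # tail is [1.0, 0.5, 0.0]
err_l2 = best_k_term_error(x, 2, p=2.0)
print(err_l1, err_l2)
```

For an exactly $k$-sparse vector the tail is empty or zero and the error vanishes, which is the sense in which compressible signals are "close" to sparse ones.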

Compressive sampling consists of acquiring linear, non-adaptive measurements of sparse or compressible signals, 
possibly corrupted by noise, and recovering (an approximation to) the original signal from the compressive samples 
via a computationally tractable algorithm. In other words, the compressive samples are obtained by multiplying 
the signal of interest by a compressive sampling (measurement) matrix. The success of recovery algorithms relies 
heavily on certain properties of this matrix. To state this dependence precisely, we next define the restricted isometry 
constants of a matrix. 

Definition 5.1. The restricted isometry constant $\gamma_k := \gamma_k(\Phi)$ of a matrix $\Phi \in \mathbb{R}^{m \times N}$ is the smallest
constant for which

$$(1 - \gamma_k)\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1 + \gamma_k)\|x\|_2^2$$

for all $x \in \Sigma_k^N$.

Suppose that $\Phi \in \mathbb{R}^{m \times N}$ is used as a compressive sampling matrix. Here, $m$ denotes the number of measurements
and is significantly smaller than $N$, the ambient dimension of the signal. Let $y := \Phi x + w$ denote the (possibly)
perturbed measurements of a signal $x \in \mathbb{R}^N$, where the unknown perturbation $w$ satisfies $\|w\|_2 \le \varepsilon$. A crucial result
in the theory of compressive sampling states that if the restricted isometry constants of $\Phi$ are suitably controlled (e.g.
as originally stated in [8], or more recently as in m which only assumes $\gamma_{ak} < \sqrt{(a-1)/a}$ for some $a \ge 4/3$), then
there is an approximate recovery $\Delta_1^{\varepsilon}(\Phi, y)$ of $x$ which satisfies

$$\|x - \Delta_1^{\varepsilon}(\Phi, y)\|_2 \le C\varepsilon + D\,\frac{\sigma_k(x)_{\ell_1}}{\sqrt{k}}. \tag{37}$$

Here, $\Delta_1^{\varepsilon}(\Phi, y)$ is found by mapping $y$ to a minimizer of a tractable, convex optimization problem, often
called the “Basis Pursuit Denoise” algorithm, given by

$$\Delta_1^{\varepsilon}(\Phi, y) := \arg\min_{z} \|z\|_1 \quad\text{subject to}\quad \|\Phi z - y\|_2 \le \varepsilon.$$

$C$ and $D$ are constants that depend on $\Phi$, but can be made absolute by slightly stronger assumptions on $\Phi$.

Note that in the noiseless case, it follows from (37) that any $k$-sparse signal can be exactly recovered from its
compressive samples as $\Delta_1^{0}(\Phi, \Phi x)$. In the general case, the approximation error remains within the noise level and
within the best $k$-term approximation error of $x$ in $\ell_1$. Hence the recovery is robust with respect to the amount of
noise and stable with respect to violation of the exact sparsity assumption. The decoder $\Delta_1^{\varepsilon}$ is a robust compressive
sampling decoder as defined next.

Definition 5.2 ([29, Definition 4.9]). Let $\varepsilon > 0$, let $m, N$ be positive integers such that $m < N$ and suppose that
$\Phi \in \mathbb{R}^{m \times N}$. We say that $\Delta : \mathbb{R}^{m \times N} \times \mathbb{R}^m \to \mathbb{R}^N$ is a robust compressive sampling decoder with parameters $(k, a, \gamma)$, $k < m$,
and constant $C$ if

$$\|x - \Delta(\Phi, \Phi x + e)\|_2 \le C\varepsilon, \tag{38}$$

for all $x \in \Sigma_k^N$, $\|e\|_2 \le \varepsilon$, and all matrices $\Phi$ with a restricted isometry constant $\gamma_{ak} < \gamma$.

Examples of robust decoders include $\Delta_1^{\varepsilon}$ and its $p$-norm generalization $\Delta_p^{\varepsilon}$ with $0 < p < 1$ 03(391, compressive
sampling matching pursuit (CoSaMP) [33], orthogonal matching pursuit (OMP) [42], and iterative hard thresholding
(IHT) 0. See also ED for detailed estimates of the relevant parameters.

5.2 Noise-shaping Quantization of Compressive Samples 

Even though noise shaping methods are tailored mainly for quantizing redundant representations, perhaps surprisingly,
they also provide efficient strategies for quantizing compressive samples (23] E] (29] ED. The approach,
originally developed in dD specifically for $\Sigma\Delta$ quantization, relies on the observation that when the original signal is
exactly sparse, compressed measurements are in fact redundant frame coefficients of the sparse signal restricted to its
support. Since then it has been extended for beta encoding and applied to compressible signals as well GS. We start
with the case of sparse signals.
5.2.1 Sparse signals

Let $x \in \Sigma_k^N$ with $\operatorname{supp}(x) = T$ and $\Phi \in \mathbb{R}^{m \times N}$ be a compressive sampling matrix. Then, we have

$$y = \Phi x \implies y = \Phi_T x_T,$$

where $\Phi_T$ is the submatrix of $\Phi$ consisting of its columns indexed by $T$ and $x_T$ is the restriction of $x$ to $T$. Accordingly,
any quantization technique designed for frames could be adapted to compressive sampling as follows:

Quantization: Since the compressive samples are in fact frame coefficients, apply the noise-shaping quantization
algorithm directly to the compressive samples $y$ to obtain the quantized samples, say, $q$. Note that the quantization
process is blind to the support of the sparse signal as well as to the sampling operator.

Reconstruction: Reconstruct via the following two-stage reconstruction algorithm. To obtain an estimate $x^\#$ of $x$
from $q$:

1. Coarse Recovery: Solve

$$\tilde{x} = \Delta_1^{\varepsilon_q}(\Phi, q) \tag{39}$$

where $\varepsilon_q$ is an upper bound on $\|y - q\|_2$, which depends on the quantization scheme and is known explicitly.
Note that the decoder $\Delta_1^{\varepsilon_q}$ above can be replaced with any robust compressive sampling decoder $\Delta$. Clearly, by
(38), $\|x - \tilde{x}\|_2$ will be small if $\varepsilon_q$ is small.

2. Fine Recovery: Obtain a support estimate, $\tilde{T}$, of $x$ from $\tilde{x}$. A finer approximation $x^\#$ is then obtained by
reconstructing with an appropriate alternative dual of the underlying frame $\Phi_{\tilde{T}}$ based on the noise-shaping
operator that was employed for quantization.

The success of the two-stage reconstruction algorithm relies on the accurate recovery of the support of $x$. In
turn, this can be guaranteed by a size condition on the smallest-in-magnitude non-zero entry of $x$. To see this,
note that for all $i \in T$, the robustness guarantee (38) yields $|\tilde{x}_i - x_i| \le C\varepsilon_q$, which, together with the size condition
$\min_{i \in T}|x_i| > 2C\varepsilon_q$, gives $|\tilde{x}_i| > C\varepsilon_q$. Moreover, by (38) we have $|\tilde{x}_i| \le C\varepsilon_q$ for all $i \in T^c$. Consequently, the
largest-in-magnitude $k$ coefficients of $\tilde{x}$ are supported on $T$. Thus, we have the following proposition.

Proposition 5.1. Suppose that $x \in \Sigma_k^N$ with $\operatorname{supp}(x) = T$, and let $\Phi \in \mathbb{R}^{m \times N}$ be a compressive sampling matrix so that
(38) holds for $\Delta = \Delta_1^{\varepsilon_q}$ with robustness constant $C$. Let $\tilde{x}$ be as in (39) where $\|\Phi x - q\|_2 \le \varepsilon_q$. If $\min_{i \in T}|x_i| > 2C\varepsilon_q$,
then the $k$ largest-in-magnitude coefficients of $\tilde{x}$ are supported on $T$.
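
The mechanism behind Proposition 5.1 can be illustrated with a toy example. In the Python sketch below, the coarse estimate is simulated by adding an entrywise perturbation of size at most `C_eps` (standing in for $C\varepsilon_q$) to a sparse vector whose smallest non-zero entry exceeds $2C\varepsilon_q$ in magnitude; no actual decoder is run, and the vector, the perturbation, and the helper name `top_k_support` are hypothetical.

```python
def top_k_support(v, k):
    """Indices of the k largest-in-magnitude entries of v."""
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    return set(order[:k])


x = [0.0, 1.2, 0.0, 0.0, -0.9, 0.0, 0.7, 0.0]
T = {i for i, t in enumerate(x) if t != 0.0}

C_eps = 0.3  # stands in for C * eps_q; note min |x_i| on T is 0.7 > 2 * 0.3
# Simulated coarse-recovery error with every entry bounded by C_eps in magnitude.
perturbation = [0.3, -0.3, 0.25, -0.3, 0.3, 0.1, -0.3, 0.3]
x_coarse = [a + b for a, b in zip(x, perturbation)]

T_estimate = top_k_support(x_coarse, len(T))
print(sorted(T_estimate), sorted(T))
```

Entries on $T$ remain strictly above $C\varepsilon_q$ in magnitude while entries off $T$ stay at or below it, so thresholding to the $k$ largest entries recovers $T$ even though the values themselves are only coarsely accurate.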

By this observation, the coarse recovery stage not only yields an estimate $\tilde{x}$ that satisfies $\|x - \tilde{x}\|_2 \le C\varepsilon_q$, but it
also gives an accurate estimate of the support of $x$ (via the support of the $k$ largest coefficients of $\tilde{x}$). It remains to
show that reconstruction techniques associated with noise shaping quantization for frames can be used in the fine
recovery stage to produce an estimate $x^\#$ that is more accurate than $\tilde{x}$ of the coarse stage.

When $q$ results from a noise-shaping quantization scheme, accurate recovery based on alternative duals can be
guaranteed via d 2 }. In particular, suppose that $H$ is the noise transfer operator of the quantizer. Conditioned on
recovering $T$, let $\Psi_{H^{-1}}$ be the left inverse of $\Phi_T$ as defined in (22) and set $x^\# := \Psi_{H^{-1}}q$. We then have, as before,

$$\|x_T - x^\#\|_2 \le \frac{\sqrt{m}}{\sigma_{\min}(H^{-1}\Phi_T)}\,\|u\|_\infty, \tag{40}$$

where $u$ is as in (13).

Predominantly, compressed sensing matrices $\Phi$ (hence their submatrices $\Phi_T$) are random matrices. Thus, to
uniformly control the reconstruction error via (40) one needs lower bounds on the smallest singular values of the
random matrices $H^{-1}\Phi_T$ for all $T \subset [N] := \{1, \ldots, N\}$, $|T| = k$, as well as a uniform upper bound on $\|u\|_\infty$.

We concentrate again on random matrices $\Phi$ with independent and identically distributed Gaussian or sub-Gaussian
entries. In these cases, for each fixed support $T$, $\Phi_T$ is a random frame of the type considered in Section 4
and a probabilistic lower bound on $\sigma_{\min}(H^{-1}\Phi_T)$ follows from Theorem 4.1 (for Gaussian entries) and Theorem 4.3
(for sub-Gaussian entries).

A uniform lower bound on $\sigma_{\min}(H^{-1}\Phi_T)$ over all support sets $T$ of size $k$ can now be deduced via a union bound
over the support sets. Note that to obtain a uniform bound over this rather large set of supports, one requires a
relatively small bound for the probability of failure on each potential support, and, consequently, a larger embedding
dimension $m$ as compared to the case of a single frame. An alternative approach based on the restricted isometry
constant, essentially yielding the same result, can be found in ED.

The approaches just outlined are general and can be applied in the case of any noise shaping quantizer that allows
exact recovery of the support of sparse vectors via Proposition 5.1. In the following, however, we focus on the special
case of $r$th-order $\Sigma\Delta$ quantization, where $H = D^r$, and we obtain the following theorem.
Theorem 5.1 ([21, 29]). Let $r \in \mathbb{Z}^+$, fix $a \in \mathbb{N}$, $\gamma < 1$, and $c, C > 0$. Then there exist constants $C_1, C_2, C_3, C_4$ depending
only on these parameters such that the following holds.

Fix $0 < \alpha < 1$. Let $\Phi$ be an $m \times N$ matrix with independent sub-Gaussian entries that have zero mean, unit
variance, and parameter $c$, let $\Delta$ be a robust compressive sampling decoder, and let $k \in \mathbb{N}$ be such that

$$\lambda := \frac{m}{k} \ge \left(C_1 \log(eN/k)\right)^{\frac{1}{1-\alpha}}.$$

Suppose that $q$ is obtained by quantizing $\Phi z$, $z \in \mathbb{R}^N$, via the $r$th order greedy $\Sigma\Delta$ scheme with the alphabet $\mathcal{A}_L^\delta$,
and with $L \ge \lceil K/\delta \rceil + 2^r + 1$. Then with

probability exceeding I — 4e k for all x £ E^ having min |xy|>C3 <5: 

jesupp(x) 

(i) The support ofx, T, coincides with the support of the best k-term approximation ofA( 

(ii) Denoting by $\Phi_T$ and $F$ the sub-matrix of $\Phi$ corresponding to the support of $x$ and its $r$th order Sobolev dual,
respectively, and by $x_T \in \mathbb{R}^k$ the restriction of $x$ to its support, we have

$$\|x_T - Fq\|_2 \le C_4\,\lambda^{-\alpha(r - 1/2)}\,\delta.$$

We remark that in Theorem 5.1 the requirement that $L \ge \lceil K/\delta \rceil + 2^r + 1$ ensures stability of the $\Sigma\Delta$ scheme
while $\min_{j \in \operatorname{supp}(x)} |x_j| \ge C_3\delta$ implies accurate support recovery.

5.2.2 Compressible signals 

The two-stage reconstruction algorithm for sparse signals presented above applies equally well to noise-shaping
quantization based on beta encoding as discussed in Section 3.3.3. However, it turns out that for beta encoding there
is a more powerful reconstruction algorithm which works for compressible signals as well.

Let $\Phi$ now be an $m \times N$ compressive sampling matrix, and let $H$ be the $m \times m$ noise transfer operator and $V$ be
the $p \times m$ condensation operator as in (29), where again, for simplicity, we have assumed that $m/p$ is an integer. Note
that the associated noise-shaping quantization relation

$$\Phi x - q = Hu$$

implies

$$V\Phi x - Vq = VHu,$$

hence we may consider $V\Phi$ as a new condensed measurement matrix and $Vq = V\Phi x - VHu$ as the corresponding
perturbed measurement. As before,

$$\|VHu\|_2 \le \|VH\|_{\infty\to 2}\|u\|_\infty \le \sqrt{p}\,\beta^{-m/p}\,\|u\|_\infty$$

so that if the greedy quantization rule is stable (i.e., $\|u\|_\infty \le \delta$), then we can set $\varepsilon := \sqrt{p}\,\beta^{-m/p}\,\delta$ and consider the
decoder

$$q \mapsto \Delta_1^{\varepsilon}(V\Phi, Vq).$$

As it follows from the discussion of (37), if for some $a > 0$, $\gamma_{2k} := \gamma_{2k}(aV\Phi)$ is sufficiently small (say less than 1/3),
then we have the estimate

$$\|x - \Delta_1^{\varepsilon}(V\Phi, Vq)\|_2 \le Ca\varepsilon + D\,\frac{\sigma_k(x)_{\ell_1}}{\sqrt{k}} \tag{41}$$

where $C$ and $D$ are now absolute constants.

For the random (Gaussian) case, the following result is implied by our discussion above and other tools presented
earlier in this paper (for a more detailed derivation of a similar result, see Id):


Theorem 5.2. Let $\Phi$ be an $m \times N$ random matrix whose entries are i.i.d. standard Gaussian variables. Let $x \in \mathbb{R}^N$,
$\|x\|_2 \le 1$, and let $q$ be the result of quantizing the measurements $\Phi x$ with the noise transfer operator $H$ from (29) and
the alphabet $\mathcal{A}_L^\delta$ where $\beta + 2\sqrt{N}/\delta \le L$. Assume $m \ge p \ge k$ are such that $\lambda' := m/p$ is an integer and

$$\lambda := \frac{m}{k} \ge C_1\lambda'\log(N/k)$$
for some numerical constant $C_1$. Let $V$ be the $p \times m$ condensation matrix as in (29) and $\varepsilon := \sqrt{p}\,\beta^{-m/p}\,\delta$. Then with
probability exceeding $1 - e^{-p/C_1'}$ for another numerical constant $C_1'$, we have

\\x-K\{y^Vq)\\ 2 <CL8yfiJ^r m/p +D^^. 

We note that the optimal choice of the auxiliary parameters $p$ and $k$ in the above theorem depends on the success
probability as well as further information on the amount of compressibility of $x$. A rule of thumb would be to balance
the two error terms above corresponding to quantization error and approximation error. Similarly, the choice of $\beta$,
$L$, and $\delta$ can be optimized. For example, if $L \ge 2$ is given and fixed, but $\delta$ is variable, then one would minimize the
error bound (over $p$, $k$, $\beta$ and $\delta$) within a given probabilistic guarantee objective and a priori knowledge on $x$.

Finally, we end with the following remark: a recent work [38] shows that it is in fact possible to obtain an approximation
from $\Sigma\Delta$ quantized compressive samples that is robust to additive noise and is stable for compressible signals.
This approximation is obtained via a one-stage reconstruction method based on solving a simple convex optimization
problem. Furthermore, by encoding the quantized measurements via a Johnson-Lindenstrauss dimensionality
reducing embedding as in [26], one obtains near-optimal rate-distortion guarantees in the case of sparse signals. For
details, see [38].